Abstract

The Steiner tree problem is a classical NP-hard optimization problem with a wide range of practical
applications. In an instance of this problem, we are given an undirected graph $G = (V,E)$, a set of
terminals $R \subseteq V$, and non-negative costs $c_e$ for all edges $e \in E$. Any tree that contains all
terminals is called a Steiner tree; the goal is to find a minimum-cost Steiner tree. The nodes
$V \setminus R$ are called Steiner nodes.

The best approximation algorithm known for the Steiner tree problem is due to Robins and
Zelikovsky (SIAM J. Discrete Math, 2005); their greedy algorithm achieves a performance guarantee of
$1 + \frac{\ln 3}{2} \approx 1.55$. The best known linear programming (LP)-based algorithm, on the other
hand, is due to Goemans and Bertsimas (Math. Programming, 1993) and achieves an approximation ratio of
$2 - 2/|R|$.
In this paper we establish a link between greedy and LP-based approaches by showing that Robins and 
Zelikovsky's algorithm has a natural primal-dual interpretation with respect to a novel partition-based 
linear programming relaxation. We also exhibit surprising connections between the new formulation and 
existing LPs and we show that the new LP is stronger than the bidirected cut formulation. 

An instance is $b$-quasi-bipartite if each connected component of $G \setminus R$ has at most $b$ vertices.
We show that Robins and Zelikovsky's algorithm has an approximation ratio better than
$1 + \frac{\ln 3}{2}$ for such instances, and we prove that the integrality gap of our LP is between
$\frac{8}{7}$ and $\frac{2b+1}{b+1}$.

1 Introduction 

The Steiner tree problem is a classical problem in combinatorial optimization which owes its practical
importance to a host of applications in areas as diverse as VLSI design and computational biology. The
problem is NP-hard [21], and Chlebík and Chlebíková show in [6] that it is NP-hard even to approximate
the minimum-cost Steiner tree within any ratio better than $\frac{96}{95}$. They also show that it is
NP-hard to obtain an approximation ratio better than $\frac{128}{127}$ in quasi-bipartite instances of the
Steiner tree problem. These are instances in which no two Steiner vertices are adjacent in the
underlying graph $G$.



1.1 Greedy algorithms and r-Steiner trees 

One of the first approximation algorithms for the Steiner tree problem is the well-known minimum-spanning 
tree heuristic which is widely attributed to Moore [14]. Moore's algorithm has a performance ratio of 2 for 
the Steiner tree problem and this remained the best known until the 1990s, when Zelikovsky [41] suggested
computing Steiner trees with a special structure, so-called $r$-Steiner trees. Nearly all of the Steiner
tree algorithms developed since then use $r$-Steiner trees. We now provide a formal definition.

Figure 1: The figure shows a Steiner tree in (i) and its decomposition into full components in (ii). Square 
and round nodes correspond to Steiner and terminal vertices, respectively. This particular tree is 5-restricted. 

A full Steiner component (or full component for short) is a tree whose internal vertices are Steiner
vertices, and whose leaves are terminals. The edge set of any Steiner tree can be partitioned into full
components by splitting the tree at terminals; see Figure 1 for an example. An $r$-(restricted)-Steiner
tree is defined to be a Steiner tree all of whose full components have at most $r$ terminals. We remark
that such a Steiner tree may not exist in general; for example, if $G$ is a star with a Steiner vertex at
its center and more than $r$ terminals at its tips. To avoid this problem, each Steiner vertex $v$ is
cloned sufficiently many times: introduce copies of $v$ and connect these copies to all of $v$'s neighbors
in the graph. Copies of an edge have the same cost as the corresponding original edge in $G$.
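
The splitting step in the definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree is given as a list of vertex pairs, all of its leaves are terminals, and the function names `full_components` and `is_r_restricted` are hypothetical helpers introduced here, not part of the paper.

```python
from collections import defaultdict


def full_components(tree_edges, terminals):
    """Split a Steiner tree into full components by cutting it at terminals.

    tree_edges: list of (u, v) pairs forming a tree whose leaves are terminals.
    terminals:  set of terminal vertices; every other vertex is a Steiner vertex.
    Returns a list of edge lists, one list per full component.
    """
    adj = defaultdict(list)
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)

    seen = set()  # edges already assigned to a component
    components = []
    for u, v in tree_edges:
        if frozenset((u, v)) in seen:
            continue
        seen.add(frozenset((u, v)))
        comp, stack = [], [(u, v)]
        while stack:
            a, b = stack.pop()
            comp.append((a, b))
            for x in (a, b):
                if x in terminals:
                    continue  # a component never grows through a terminal
                for y in adj[x]:
                    if frozenset((x, y)) not in seen:
                        seen.add(frozenset((x, y)))
                        stack.append((x, y))
        components.append(comp)
    return components


def is_r_restricted(tree_edges, terminals, r):
    """True if every full component of the tree has at most r terminals."""
    for comp in full_components(tree_edges, terminals):
        spanned = {x for edge in comp for x in edge if x in terminals}
        if len(spanned) > r:
            return False
    return True


# Toy tree: Steiner vertex "s" joins terminals a, b, c; edge (c, d) links two terminals.
tree = [("a", "s"), ("s", "b"), ("s", "c"), ("c", "d")]
R = {"a", "b", "c", "d"}
parts = full_components(tree, R)
```

On the toy tree, the star around `s` forms one full component with three terminals and the edge `(c, d)` forms a second one, so the tree is 3-restricted but not 2-restricted.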

Let $\mathrm{opt}$ and $\mathrm{opt}_r$ be the cost of an optimum Steiner tree and that of an optimal
$r$-Steiner tree, respectively, for the given instance. Define the $r$-Steiner ratio $\rho_r$ as the
supremum of $\mathrm{opt}_r/\mathrm{opt}$ over all instances of the Steiner tree problem. In [5], Borchers
and Du provided an exact characterization of $\rho_r$. The authors showed that
$\rho_r = 1 + \Theta(1/\log r)$ and hence that $\rho_r$ tends to 1 as $r$ goes to infinity.

Computing minimum-cost $r$-Steiner trees is NP-hard for $r \ge 4$ [13], even if the underlying graph is
quasi-bipartite. The complexity status for $r = 3$ is unresolved, and the case $r = 2$ reduces to the
minimum-cost spanning tree problem.

In [41], Zelikovsky used 3-restricted full components to obtain an $\frac{11}{6}$-approximation for the
Steiner tree problem. Subsequently, a series of papers (e.g., [4, 20, 22, 30]) improved upon this result.
These efforts culminated in a recent paper by Robins and Zelikovsky [34] in which the authors presented a
$(1 + \frac{\ln 3}{2}) \approx 1.55$-approximation (subsequently referred to as RZ) for the $r$-Steiner tree
problem. They hence obtain, for each fixed $r \ge 2$, a $1.55\rho_r$ approximation algorithm for the
(unrestricted) Steiner tree problem. We refer the reader to the two surveys [19, 31].

1.2 Approaches based on linear programs 

There is a large body of work on linear programming (LP)-based approximation algorithms for problems 
in combinatorial optimization. First, one finds a good LP relaxation for the problem. Then one designs 
an algorithm that produces a feasible integral solution whose cost is provably close to that of an optimum 
fractional solution for this relaxation. Many aspects of different LP relaxations for the Steiner tree problem 
have been investigated (e.g., [3, 8, 9, 10, 12, 17, 27, 38, 39]). 

Many of these LPs have been fruitfully used in integer programming-based approaches to exactly solve 
instances of up to ten thousand nodes [28]. Another common area in which LPs are useful is the design of 
polynomial time approximation algorithms via the primal-dual method (e.g., [18]). In this method, a feasible 
solution of the relaxation's LP dual is used to obtain a lower bound on the optimum cost. 

The "classical" LP-based approximation algorithms for Steiner trees [16] and forests [2] use the
undirected cut relaxation [3] and have a performance guarantee of $2 - 2/|R|$. This relaxation has an
integrality gap of $2 - 2/|R|$ and the analysis of these algorithms is therefore tight. Slightly improved
algorithms have since been designed [23, 26] but do not achieve any constant approximation factor better
than 2.

In the special case of quasi-bipartite graphs, Rajagopalan and Vazirani [32] and Rizzi [33] obtained a
$\frac{3}{2}$-approximation for the Steiner tree problem. The analysis of [32] applies the primal-dual
method to the bidirected cut relaxation [12, 39]. The bidirected cut relaxation is widely conjectured
to have a worst-case integrality gap that is close to 1: the worst known example shows a gap of only
$\frac{8}{7}$ (see Section 5). Despite its conjectured strength, this relaxation has not yet given rise to
a Steiner tree algorithm with performance guarantee better than 2 in general graphs.



1.3 Contribution of this paper 

In this paper we provide algorithmic evidence that the primal-dual method is useful for the Steiner tree 
problem. We first present a novel LP relaxation for the Steiner tree problem. It uses full components to 
strengthen a formulation based on Steiner partition inequalities [8]. We then show that the algorithm RZ of 
Robins and Zelikovsky can be analyzed as a primal-dual algorithm using this relaxation. We can show (see 
Section 5) that our relaxation is strictly stronger than the standard Steiner partition formulation; so the use 
of full components strengthens the partition inequalities. 

In [34], Robins and Zelikovsky showed that RZ has a performance ratio of 1.279 for quasi-bipartite
graphs, and a performance ratio of 1.55 in general graphs. We prove a natural interpolation of these two
results. For a Steiner vertex $v$, define its Steiner neighborhood $S_v$ to be the collection of vertices
that are in the same connected component as $v$ in $G \setminus R$. A graph is $b$-quasi-bipartite if all of
its Steiner neighborhoods have cardinality at most $b$. Note that "1-quasi-bipartite" is synonymous with
"quasi-bipartite." We prove:



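Theorem 1. Given an undirected, $b$-quasi-bipartite graph $G = (V,E)$, terminals $R \subseteq V$, and a fixed
constant $r \ge 2$, Algorithm RZ returns a feasible Steiner tree $T$ s.t.

\[
c(T) \le
\begin{cases}
1.279 \cdot \mathrm{opt}, & b = 1, \\
\left(1 + \frac{1}{e}\right) \cdot \mathrm{opt}_r, & b \in \{2,3,4\}, \\
\left(1 + \frac{1}{2}\ln\left(3 - \frac{2}{b}\right)\right) \cdot \mathrm{opt}_r, & b \ge 5.
\end{cases}
\]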



Unfortunately, Theorem 1 does not imply that our new relaxation has a small integrality gap.
Nonetheless, we obtain the following bounds when $G$ is $b$-quasi-bipartite:

Theorem 2. Our new relaxation has an integrality gap between $\frac{8}{7}$ and $\frac{2b+1}{b+1}$.



2 Spanning trees and a new LP relaxation for Steiner trees 

Our work is strongly motivated by, and uses, results on the spanning tree polyhedron due to Chopra [7]. In 
this section, we first discuss Chopra's characterization of the spanning tree polyhedron; then we mention a 
primal-dual interpretation of Kruskal's spanning tree algorithm [25] based on Chopra's formulation. Finally 
we extend ideas in [8, 9] to derive a new LP relaxation for the Steiner tree problem. 



2.1 The spanning tree polyhedron 

To formulate the minimum-cost spanning tree (MST) problem as an LP, we associate a variable $x_e$ with
every edge $e \in E$. Each spanning tree $T$ corresponds to its incidence vector $x^T$, which is defined by
$x^T_e = 1$ if $T$ contains $e$ and $x^T_e = 0$ otherwise. Let $\Pi$ denote the set of all partitions of the
vertex set $V$, and suppose that $\pi \in \Pi$. The rank $r(\pi)$ of $\pi$ is the number of parts of $\pi$.
Let $E_\pi$ denote the set of edges whose ends lie in different parts of $\pi$. Consider the following LP.

\[
\begin{aligned}
\min\ & \sum_{e \in E} c_e x_e && (P_{SP}) \\
\text{s.t.}\ & \sum_{e \in E_\pi} x_e \ge r(\pi) - 1 && \forall \pi \in \Pi, \\
& x \ge 0.
\end{aligned}
\]

Chopra [7] showed that the feasible region of $(P_{SP})$ is the convex hull of all incidence vectors of
spanning trees, and hence each basic optimal solution corresponds to a minimum-cost spanning tree. Its
dual LP is

\[
\begin{aligned}
\max\ & \sum_{\pi \in \Pi} (r(\pi) - 1) \cdot y_\pi && (D_{SP}) \\
\text{s.t.}\ & \sum_{\pi : e \in E_\pi} y_\pi \le c_e && \forall e \in E, && (1) \\
& y \ge 0. && && (2)
\end{aligned}
\]



2.2 A primal-dual interpretation of Kruskal's MST algorithm 

Kruskal's algorithm can be viewed as a continuous process over time: we start with an empty tree at time
0 and add edges as time increases. The algorithm terminates at time $\tau^*$ with a spanning tree of the
input graph $G$. In this section we show that Kruskal's method can be interpreted as a primal-dual
algorithm (see also [18]). At any time $0 \le \tau \le \tau^*$ we keep a pair $(x_\tau, y_\tau)$, where
$x_\tau$ is a partial (possibly infeasible) 0-1 primal solution for $(P_{SP})$ and $y_\tau$ is a feasible
dual solution for $(D_{SP})$. Initially, we let $x_{e,0} = 0$ for all $e \in E$ and $y_{\pi,0} = 0$ for all
$\pi \in \Pi$.

Let $G_\tau$ denote the forest corresponding to the partial solution $x_\tau$ and let $E_\tau$ denote its
edges, i.e., $E_\tau = \{e \in E \mid x_{e,\tau} = 1\}$. We then denote by $\pi_\tau$ the partition induced by
the connected components of $G_\tau$. At time $\tau$, the algorithm increases $y_{\pi_\tau}$ until a
constraint of type (1) becomes tight for some edge $e \in E_{\pi_\tau}$, that is, an edge joining two
different components of $G_\tau$. Assume that this happens at time $\tau' > \tau$. The dual update is

\[
y_{\pi_\tau, \tau'} = \tau' - \tau.
\]

We then include $e$ in our solution, i.e., we set $x_{e,\tau'} = 1$. If more than one edge becomes tight at
time $\tau'$, we can process these events in arbitrary order; in particular, any such tight edge may be
added first. We terminate when $G_\tau$ is a spanning tree. Chopra [7] showed that the final primal and
dual solutions have the same objective value (and are hence optimal), and we give a proof of this fact for
completeness.

Theorem 3. At time $\tau^*$, algorithm MST finishes with a pair $(x_{\tau^*}, y_{\tau^*})$ of primal and dual
feasible solutions to $(P_{SP})$ and $(D_{SP})$, respectively, such that

\[
\sum_{e \in E} c_e x_{e,\tau^*} = \sum_{\pi \in \Pi} (r(\pi) - 1) \cdot y_{\pi,\tau^*}.
\]

Proof. Notice that for all edges $e \in E_{\tau^*}$ we must have
$c_e = \sum_{\pi : e \in E_\pi} y_{\pi,\tau^*}$ and hence, we can express the cost of the final tree as
follows:

\[
c(E_{\tau^*}) = \sum_{e \in E_{\tau^*}} \sum_{\pi : e \in E_\pi} y_{\pi,\tau^*}
= \sum_{\pi \in \Pi} |E_{\tau^*} \cap E_\pi| \cdot y_{\pi,\tau^*}.
\]

By construction the set $E_{\tau^*} \cap E_\pi$ has cardinality exactly $r(\pi) - 1$ for all $\pi \in \Pi$
with $y_{\pi,\tau^*} > 0$. We obtain that
$\sum_{e \in E} c_e x_{e,\tau^*} = \sum_{\pi \in \Pi} (r(\pi) - 1) \cdot y_{\pi,\tau^*}$ and this finishes the
proof of the theorem. □

Observe that the above primal-dual algorithm is indeed Kruskal's algorithm: if the algorithm adds an
edge $e$ at time $\tau$, then $e$ is the minimum-cost edge connecting two connected components of $G_\tau$.
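
The primal-dual view of Kruskal's algorithm can be traced on a small example. The sketch below is an illustration only, assuming a connected graph given as `(cost, u, v)` triples; the function name `kruskal_primal_dual` and the `(rank, y)` bookkeeping are choices made here for exposition. It records, for each partition whose dual variable is raised, the number of parts and the amount of the raise, and then compares the primal cost with the dual objective of $(D_{SP})$.

```python
def kruskal_primal_dual(vertices, edges):
    """Kruskal's algorithm with the dual growth of Section 2.2 made explicit.

    edges: list of (cost, u, v) with non-negative costs; the graph is assumed connected.
    Returns (tree, duals): tree is the list of chosen edges and duals is a list
    of (rank, y) pairs, one per partition whose dual variable was raised, where
    rank is the number of parts of that partition and y the amount of the raise.
    """
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree, duals = [], []
    rank = len(parent)  # the initial partition consists of singletons
    now = 0             # current time tau
    for cost, u, v in sorted(edges, key=lambda e: e[0]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # edge lies inside one part, so it is never added
        if cost > now:
            duals.append((rank, cost - now))  # raise y for the current partition
            now = cost
        parent[ru] = rv
        tree.append((cost, u, v))
        rank -= 1
    return tree, duals


edges = [(1, "a", "b"), (2, "b", "c"), (3, "a", "c"), (2, "c", "d")]
tree, duals = kruskal_primal_dual("abcd", edges)
primal = sum(cost for cost, _, _ in tree)
dual = sum((rank - 1) * y for rank, y in duals)
```

On this instance both `primal` and `dual` evaluate to 5, matching the equality of Theorem 3. Ties are handled implicitly: when several edges become tight at the same time, no dual variable is raised between the corresponding merges.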






2.3 A new LP relaxation for Steiner trees 



In an instance of the Steiner tree problem, a partition $\pi$ of $V$ is defined to be a Steiner partition
when each part of $\pi$ contains at least one terminal. Chopra and Rao [8] introduced this notion and
proved that, when $x$ is the incidence vector of a Steiner tree and $\pi$ is a Steiner partition, the
inequality

\[
\sum_{e \in E_\pi} x_e \ge r(\pi) - 1 \qquad (3)
\]

holds. These Steiner partition inequalities motivate our approach.

In the following we use $G[U]$ to denote the subgraph of $G$ induced by vertex set $U$, i.e., the graph with
vertex set $U$ and such that $E(G[U]) = \{uv \in E(G) \mid u \in U, v \in U\}$. We make the following
assumptions:

A1. $G[R]$ is a complete graph and, for any two terminals $u,v \in R$, $c_{uv}$ is the cost of a
minimum-cost $u,v$-path in $G$.

A2. For every Steiner vertex $v$ and every vertex $u \in S_v \cup R$, $uv$ is an edge of $G$, and $c_{uv}$ is
the cost of a minimum-cost $u,v$-path in $G$.

It is a well-known fact that these assumptions are w.l.o.g., i.e., any given instance can be transformed
into an equivalent instance that satisfies these assumptions (e.g., see [36]). Note that
$b$-quasi-bipartiteness is preserved by these assumptions.
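
As a rough illustration of how assumption A1 can be enforced, the following sketch computes shortest-path costs with the Floyd-Warshall algorithm and returns the complete graph on the terminals. It is not the transformation of [36]; the names `metric_closure` and `terminal_clique` are hypothetical, and the input graph is assumed connected with non-negative costs.

```python
def metric_closure(vertices, edges):
    """All-pairs shortest path costs via Floyd-Warshall.

    edges: list of (cost, u, v) for an undirected graph with non-negative costs.
    """
    inf = float("inf")
    dist = {u: {v: (0 if u == v else inf) for v in vertices} for u in vertices}
    for cost, u, v in edges:
        if cost < dist[u][v]:
            dist[u][v] = dist[v][u] = cost
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def terminal_clique(vertices, edges, terminals):
    """Edges of the complete graph G[R] required by assumption A1."""
    dist = metric_closure(vertices, edges)
    ts = sorted(terminals)
    return [(dist[u][v], u, v) for i, u in enumerate(ts) for v in ts[i + 1:]]


V = ["a", "b", "c", "s"]
E = [(1, "a", "s"), (1, "s", "b"), (5, "a", "b"), (4, "b", "c")]
clique = terminal_clique(V, E, {"a", "b", "c"})
```

Here the direct edge between `a` and `b` costs 5, but the path through the Steiner vertex `s` costs 2, so the clique edge for that pair receives cost 2. Assumption A2 could be handled with the same distance table, restricted to pairs inside a Steiner neighborhood.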

Recall from Section 1.1 that a full component is a tree whose internal vertices are Steiner vertices and
all of whose leaves are terminals. Also recall that a full component $K$ is $r$-restricted if it contains
at most $r$ terminals. Further, the edge set of any $r$-restricted Steiner tree $T$ can be partitioned into
$r$-restricted full components. From now on, let $r \ge 2$ be an arbitrary fixed constant. Define

\[
\mathcal{K}_r := \{K \subseteq R : 2 \le |K| \le r \text{ and there exists a full component whose terminal set is } K\}.
\]

We note that, for each $K \in \mathcal{K}_r$, we can determine a minimum-cost full component with terminal
set $K$ in polynomial time (e.g., by using the dynamic programming algorithm of Dreyfus and Wagner [11]).
Thus, we can compute $\mathcal{K}_r$ in polynomial time as well.

For brevity we will abuse notation slightly and use $K \in \mathcal{K}_r$ interchangeably for a subset of
the terminal set and for a particular min-cost full component spanning $K$. Given any $r$-restricted
Steiner tree, we may assume that all of its full components are from $\mathcal{K}_r$, without increasing
its cost.

For each full component $K$, we use $E(K)$ to denote its edges, $V(K)$ to denote its vertices (including
Steiner vertices), and $c_K$ to denote its cost. For a set $\mathcal{S}$ of full components we define
$E(\mathcal{S}) := \bigcup_{K \in \mathcal{S}} E(K)$ and similarly
$V(\mathcal{S}) := \bigcup_{K \in \mathcal{S}} V(K)$. By assumption A1 we may assume that the full component
for a terminal pair is just the edge linking those terminals, and by assumption A2 we may assume that any
Steiner node has degree at least 3. We will also assume that any two distinct full components
$K_1, K_2 \in \mathcal{K}_r$ are edge disjoint and internally vertex disjoint. This assumption is without
loss of generality as each Steiner vertex in $G$ can be cloned a sufficient number of times to ensure this
property. Finally, we redefine $G$ to be $(V(\mathcal{K}_r), E(\mathcal{K}_r))$; as a result, the Steiner
trees of the new graph correspond to the $r$-restricted Steiner trees of the original graph.

Let $\mathcal{K}_r(T)$ denote the set of all full components of a Steiner tree $T$. For an arbitrary
subfamily $\mathcal{S}$ of the full components $\mathcal{K}_r$, our new LP uses the following canonical
decomposition of a Steiner tree into elements of $E(\mathcal{S})$ and
$\mathcal{K}_r \setminus \mathcal{S}$. The idea, as we will explain later, is to iteratively select a "good"
set $\mathcal{S}$.

Definition 4. If $T$ is an $r$-restricted Steiner tree, its $\mathcal{S}$-decomposition is the pair

\[
\left(E(T) \cap E(\mathcal{S}),\ \mathcal{K}_r(T) \setminus \mathcal{S}\right).
\]

Observe that after $\mathcal{S}$-decomposing a Steiner tree $T$ we have

\[
\sum_{e \in E(T) \cap E(\mathcal{S})} c_e + \sum_{K \in \mathcal{K}_r(T) \setminus \mathcal{S}} c_K = c(T).
\]

We hence obtain a new higher-dimensional view of the Steiner tree polyhedron. Define

\[
\mathrm{ST}^{\mathcal{S}}_{G,R} := \mathrm{conv}\left\{x \in \{0,1\}^{E(\mathcal{S})} \times \{0,1\}^{\mathcal{K}_r \setminus \mathcal{S}} :
\exists\, T \in \mathrm{ST}_{G,R} \text{ s.t. } x \text{ is the incidence vector of the } \mathcal{S}\text{-decomposition of } T\right\}.
\]

The following definitions are used to generalize Steiner partition inequalities to use full components.
We use $\Pi^{\mathcal{S}}$ to denote the family of all partitions of $V(\mathcal{S}) \cup R$.

Definition 5. Let $\pi = \{V_1, \ldots, V_p\} \in \Pi^{\mathcal{S}}$ be a partition of the set
$R \cup V(\mathcal{S})$. The rank contribution of full component
$K \in \mathcal{K}_r \setminus \mathcal{S}$ is defined as

\[
\mathrm{rc}^{\pi}_K := |\{i : K \text{ contains a terminal in } V_i\}| - 1.
\]

The Steiner rank $\bar{r}(\pi)$ of $\pi$ is defined as

\[
\bar{r}(\pi) := \text{the number of parts of } \pi \text{ that contain terminals}.
\]
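
Both quantities in Definition 5 are simple counts, as the following sketch shows. It represents a partition as a list of Python sets and a full component by its terminal set; the helper names, and the function `partition_inequality_holds` that evaluates a lifted partition inequality of the form "crossing edges plus rank contributions is at least Steiner rank minus one", are illustrative assumptions rather than notation from the paper.

```python
def steiner_rank(partition, terminals):
    """Number of parts of the partition that contain at least one terminal."""
    return sum(1 for part in partition if part & terminals)


def rank_contribution(partition, component_terminals):
    """Number of parts containing a terminal of the component, minus one."""
    return sum(1 for part in partition if part & component_terminals) - 1


def partition_inequality_holds(partition, terminals, x_edges, x_comps):
    """Evaluate the lifted partition inequality at the point (x_edges, x_comps).

    x_edges: dict mapping an edge (u, v) of E(S) to its value.
    x_comps: dict mapping a frozenset of terminals (a full component) to its value.
    """
    part_of = {v: i for i, part in enumerate(partition) for v in part}
    crossing = sum(val for (u, v), val in x_edges.items() if part_of[u] != part_of[v])
    lifted = sum(rank_contribution(partition, K) * val for K, val in x_comps.items())
    return crossing + lifted >= steiner_rank(partition, terminals) - 1


R = {"a", "b", "c"}
K = frozenset(R)                      # one full component spanning all terminals
singletons = [{"a"}, {"b"}, {"c"}]    # every terminal in its own part
mixed = [{"a", "s"}, {"b", "c"}, {"t"}]  # "s" and "t" stand for Steiner vertices
no_edges = {("a", "b"): 0, ("b", "c"): 0, ("a", "c"): 0}
```

For the partition into singletons the Steiner rank is 3 and `K` has rank contribution 2, so picking `K` integrally satisfies the inequality while picking it with value 0.5 does not. In the partition `mixed`, the part `{"t"}` contains no terminal and therefore does not count toward the Steiner rank.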

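We describe below a new LP relaxation $(P^{\mathcal{S}}_{ST})$ of $\mathrm{ST}^{\mathcal{S}}_{G,R}$. The
relaxation has a variable $x_e$ for each $e \in E(\mathcal{S})$ and a variable $x_K$ for each
$K \in \mathcal{K}_r \setminus \mathcal{S}$. For a partition $\pi \in \Pi^{\mathcal{S}}$, we define
$E_\pi(\mathcal{S})$ to be the edges of $\mathcal{S}$ whose endpoints lie in different parts of $\pi$,
i.e., $E_\pi(\mathcal{S}) = E(\mathcal{S}) \cap E_\pi$.

\[
\begin{aligned}
\min\ & \sum_{e \in E(\mathcal{S})} c_e x_e + \sum_{K \in \mathcal{K}_r \setminus \mathcal{S}} c_K x_K && && (P^{\mathcal{S}}_{ST}) \\
\text{s.t.}\ & \sum_{e \in E_\pi(\mathcal{S})} x_e + \sum_{K \in \mathcal{K}_r \setminus \mathcal{S}} \mathrm{rc}^{\pi}_K x_K \ge \bar{r}(\pi) - 1 && \forall \pi \in \Pi^{\mathcal{S}}, && (4) \\
& x_e, x_K \ge 0 && \forall e \in E(\mathcal{S}),\ K \in \mathcal{K}_r \setminus \mathcal{S}. && (5)
\end{aligned}
\]

Its LP dual has a variable $y_\pi$ for each partition $\pi \in \Pi^{\mathcal{S}}$:

\[
\begin{aligned}
\max\ & \sum_{\pi \in \Pi^{\mathcal{S}}} (\bar{r}(\pi) - 1) \cdot y_\pi && && (D^{\mathcal{S}}_{ST}) \\
\text{s.t.}\ & \sum_{\pi : e \in E_\pi(\mathcal{S})} y_\pi \le c_e && \forall e \in E(\mathcal{S}), && (6) \\
& \sum_{\pi \in \Pi^{\mathcal{S}}} \mathrm{rc}^{\pi}_K\, y_\pi \le c_K && \forall K \in \mathcal{K}_r \setminus \mathcal{S}, && (7) \\
& y_\pi \ge 0 && \forall \pi \in \Pi^{\mathcal{S}}. && (8)
\end{aligned}
\]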

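We conclude this section with a proof that the (primal) LP is indeed a relaxation of the convex hull of
$\mathcal{S}$-decompositions for $r$-restricted Steiner trees. Obviously, constraints (5) hold whenever $x$
is the incidence vector of the $\mathcal{S}$-decomposition of a Steiner tree.

Lemma 6. The inequality (4) is valid for $\mathrm{ST}^{\mathcal{S}}_{G,R}$.

Proof. Suppose, for the sake of contradiction, that (4) is not valid for
$\mathrm{ST}^{\mathcal{S}}_{G,R}$. Then there must exist a feasible Steiner tree $T$ with
$\mathcal{S}$-decomposition $(E(T) \cap E(\mathcal{S}), \mathcal{K}_r(T) \setminus \mathcal{S})$ whose
incidence vector $x \in \mathrm{ST}^{\mathcal{S}}_{G,R}$ violates (4) for some partition
$\pi \in \Pi^{\mathcal{S}}$. Choose such a partition $\pi$ with smallest rank.

Observe first that $\pi$ must be a Steiner partition. Otherwise, there is a part $V_1$ of $\pi$ that
contains no terminals. Let $V_2$ be a part in $\pi$ that contains terminals and obtain a new partition
$\pi'$ from $\pi$ by merging $V_1$ and $V_2$. As $V_1$ contains no terminals, we clearly have
$\mathrm{rc}^{\pi}_K = \mathrm{rc}^{\pi'}_K$ for all full components $K \in \mathcal{K}_r$. Also, the
Steiner rank of $\pi$ and $\pi'$ is the same. As $e \in E_{\pi'}(\mathcal{S})$ implies that
$e \in E_\pi(\mathcal{S})$, it follows that (4) is violated for $\pi'$ as well and $\pi'$ has smaller rank
than $\pi$, which contradicts our choice.

Suppose that $V(T) \subseteq R \cup V(\mathcal{S})$. This would mean that
$\mathcal{K}_r(T) \setminus \mathcal{S} = \emptyset$ and in this case, inequality (3) implies that

\[
\sum_{e \in E_\pi(\mathcal{S})} x_e \ge r(\pi) - 1 = \bar{r}(\pi) - 1,
\]

where the equality holds because $\pi$ is a Steiner partition. Thus, inequality (4) holds for $\pi$ and
$x$, which is a contradiction.

We may therefore assume that $\mathcal{K}_r(T) \setminus \mathcal{S}$ contains some full component
$\hat{K}$. We obtain a new partition $\pi'$ from $\pi$ by merging those parts of $\pi$ that contain
terminals spanned by $\hat{K}$. The rank of this new partition is
$r(\pi) - \mathrm{rc}^{\pi}_{\hat{K}}$. It follows from our choice of $\pi$ that

\[
\sum_{e \in E_{\pi'}(\mathcal{S})} x_e + \sum_{K \in \mathcal{K}_r \setminus \mathcal{S}} \mathrm{rc}^{\pi'}_K x_K
\ge \bar{r}(\pi') - 1 = \bar{r}(\pi) - \mathrm{rc}^{\pi}_{\hat{K}} - 1.
\]

Now note that $E_{\pi'}(\mathcal{S}) \subseteq E_\pi(\mathcal{S})$ and $\mathrm{rc}^{\pi'}_{\hat{K}} = 0$,
and that $\mathrm{rc}^{\pi'}_K \le \mathrm{rc}^{\pi}_K$ for all
$K \in \mathcal{K}_r \setminus \mathcal{S}$. Since $x_{\hat{K}} = 1$, the above inequality therefore
implies

\[
\sum_{e \in E_\pi(\mathcal{S})} x_e + \sum_{K \in \mathcal{K}_r \setminus \mathcal{S}} \mathrm{rc}^{\pi}_K x_K
\ge \sum_{e \in E_{\pi'}(\mathcal{S})} x_e + \sum_{K \in \mathcal{K}_r \setminus \mathcal{S}} \mathrm{rc}^{\pi'}_K x_K + \mathrm{rc}^{\pi}_{\hat{K}}
\ge \bar{r}(\pi) - \mathrm{rc}^{\pi}_{\hat{K}} - 1 + \mathrm{rc}^{\pi}_{\hat{K}},
\]

which in turn proves that (4) holds for $\pi$ and $x$. This contradiction completes the proof of the
lemma. □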

3 An iterated primal-dual algorithm for Steiner trees 

As described in Section 2.2, $\mathrm{MST}(G, c)$ denotes a call to Kruskal's minimum-spanning tree
algorithm on graph $G$ with cost function $c$. It returns a minimum-cost spanning tree $T$ and an optimal
feasible dual solution $y$ for $(D_{SP})$. Let $\mathrm{mst}(G,c)$ denote the cost of $\mathrm{MST}(G,c)$.
Since $c$ is fixed, in the rest of the paper we omit $c$ where possible for brevity. Let us also abuse
notation and identify each set $\mathcal{S} \subseteq \mathcal{K}_r$ of full components with the graph
$(V(\mathcal{S}), E(\mathcal{S}))$.

The main idea of the greedy algorithms in [34, 40, 41] is to find a set
$\mathcal{S} \subseteq \mathcal{K}_r$ of full components such that $\mathrm{MST}(\mathcal{S})$ has small
cost relative to $\mathrm{opt}_r$. Let $\binom{R}{2}$ denote the collection of all pairs of terminals. The
algorithms all start with $\mathcal{S} = \binom{R}{2}$ and then grow $\mathcal{S}$, so for the rest of the
paper we assume that $\binom{R}{2} \subseteq \mathcal{S}$; hence
$E(G[R]) \subseteq E(\mathcal{S})$ and $R \subseteq V(\mathcal{S})$.

The reason that MST is useful in our primal-dual framework is that we can relate the dual program
$(D_{SP})$ on graph $\mathcal{S}$ to the dual program $(D^{\mathcal{S}}_{ST})$. Let $y$ be the feasible dual
returned by a call to $\mathrm{MST}(\mathcal{S})$. We treat $y$ as a dual solution of
$(D^{\mathcal{S}}_{ST})$ by setting each remaining $y_\pi$ to zero; note that constraints (1) and (2) of
$(D_{SP})$ imply that $y$ also meets constraints (6) and (8) of $(D^{\mathcal{S}}_{ST})$. If $K$ is a full
component such that (7) does not hold for $y$, we say that $K$ is violated by $y$.

The primal-dual algorithm finds such a set $\mathcal{S}$ in an iterative fashion. Initially,
$\mathcal{S}$ is equal to $\binom{R}{2}$. In each iteration, we compute a minimum-cost spanning tree $T$
of the graph $\mathcal{S}$. The dual solution $y$ corresponding to this tree is converted to a dual for
$(D^{\mathcal{S}}_{ST})$, and if $y$ is feasible for $(D^{\mathcal{S}}_{ST})$, we stop. Otherwise, we add a
violated full component to $\mathcal{S}$ and continue. The algorithm clearly terminates (as
$\mathcal{K}_r$ is finite) and at termination, it returns the final tree $T$ as an approximately-optimal
Steiner tree.

Algorithm 1 summarizes the above description. The greedy algorithms in [34, 40, 41] differ only in how
$K$ is selected in each iteration, i.e., in how the selection function
$f_i : \mathcal{K}_r \to \mathbb{R}$ is defined (see also [19, §1.4] for a well-written comparison of
these algorithms).



Algorithm 1 A general iterative primal-dual framework for Steiner trees.

1: Given: Undirected graph $G = (V,E)$, non-negative costs $c_e$ for all edges $e \in E$, constant $r \ge 2$.
2: $\mathcal{S}^0 := \binom{R}{2}$, $i := 0$
3: repeat
4:     $(T^i, y^i) := \mathrm{MST}(\mathcal{S}^i)$
5:     if $y^i$ is not feasible for $(D^{\mathcal{S}^i}_{ST})$ then
6:         Choose a violated full component $K^i \in \mathcal{K}_r \setminus \mathcal{S}^i$ such that $f_i(K^i)$ is minimized
7:         $\mathcal{S}^{i+1} := \mathcal{S}^i \cup \{K^i\}$
8:     end if
9:     $i := i + 1$
10: until $y^{i-1}$ is feasible for $(D^{\mathcal{S}^{i-1}}_{ST})$
11: Let $p = i - 1$ and return $(T^p, y^p)$.



The following lemma is at the heart of our proof, and explains why our LP can be used to find cheap 
Steiner trees. 

Lemma 7. Let $(T,y) = \mathrm{MST}(\mathcal{S})$ and suppose that $K$ is violated by $y$. Then adding $K$
to $\mathcal{S}$ produces a cheaper spanning tree, i.e.,

\[
\mathrm{mst}(\mathcal{S} \cup \{K\}) < c(T).
\]

Proof. Assume that $\mathrm{MST}(\mathcal{S})$ finishes at time $\tau^*$ and, once again, let $\pi_\tau$ be
the partition maintained by Kruskal's algorithm at time $0 \le \tau \le \tau^*$.

Define $q = \mathrm{rc}^{\pi_0}_K$ to be the rank contribution of $K$ with respect to the initial
partition. Clearly, $\mathrm{rc}^{\pi_{\tau^*}}_K = 0$ as all terminals are contained in the same connected
component at time $\tau^*$. Then there are edges $e_1, \ldots, e_q \in T$ such that, for $1 \le i \le q$,
the rank contribution of $K$ with respect to the partition maintained by Kruskal's algorithm drops from
$q - i + 1$ to $q - i$ when edge $e_i$ is added. Formally, for $1 \le i \le q$, let $\pi_i$ and $\pi'_i$ be
the partition maintained by Kruskal's algorithm before and after adding edge $e_i$; then

\[
\mathrm{rc}^{\pi_i}_K = \mathrm{rc}^{\pi'_i}_K + 1.
\]

We denote the time of addition of edge $e_i$ by $\tau_i$ for all $i$.

From the description of Kruskal's algorithm it follows that

\[
\sum_{i=1}^{q} c_{e_i} = \sum_{i=1}^{q} \tau_i = \int_0^{\tau^*} \mathrm{rc}^{\pi_\tau}_K \, d\tau
\]

and the right-hand side of this equality is equal to
$\sum_{\pi \in \Pi^{\mathcal{S}}} \mathrm{rc}^{\pi}_K\, y_\pi$. The fact that constraint (7) is violated for
$K$ therefore implies that

\[
c_{e_1} + \cdots + c_{e_q} > c_K.
\]

Finally observe that $T \cup E(K) \setminus \{e_1, \ldots, e_q\}$ is a spanning tree of
$\mathcal{S} \cup \{K\}$ and its cost is smaller than that of $T$. □

Figure 2: The figure shows the Steiner tree instance from Figure 1 with costs on the edges. The loss of the
Steiner tree in this figure is shown in thick edges. Its cost is 8.
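
The proof of Lemma 7 suggests a direct way to test whether a full component is violated: the left-hand side of constraint (7) equals the total cost of those Kruskal edges whose addition merges two parts that both contain terminals of $K$. The sketch below follows that observation. The function name `kruskal_with_load`, the tiny instance, and the label `s` for the Steiner vertex are assumptions made for illustration.

```python
def kruskal_with_load(vertices, edges, K=frozenset()):
    """Run Kruskal on a connected graph and return (mst_cost, load).

    edges: list of (cost, u, v).
    load is the sum of the costs of those tree edges whose addition merges two
    parts that both contain a terminal of K; by the argument in the proof of
    Lemma 7 this equals the left-hand side of constraint (7) for K.
    """
    parent = {v: v for v in vertices}
    touches_K = {v: (v in K) for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    total, load = 0, 0
    for cost, u, v in sorted(edges, key=lambda e: e[0]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        if touches_K[ru] and touches_K[rv]:
            load += cost  # the rank contribution of K drops by one here
        parent[ru] = rv
        touches_K[rv] = touches_K[rv] or touches_K[ru]
        total += cost
    return total, load


# Three terminals at pairwise cost 2, and a star through Steiner vertex "s" of cost 3.
clique = [(2, "a", "b"), (2, "b", "c"), (2, "a", "c")]
star = [(1, "a", "s"), (1, "b", "s"), (1, "c", "s")]
K, c_K = frozenset("abc"), 3

cost_before, load = kruskal_with_load("abc", clique, K)
violated = load > c_K
cost_after, _ = kruskal_with_load("abcs", clique + star)
```

In this example the load of `K` is 4, which exceeds its cost 3, so `K` is violated; adding the star lowers the spanning tree cost from 4 to 3, in line with Lemma 7.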



3.1 Cutting losses: the RZ selection function 

A potential weak point in Algorithm 1 is that once a full component is added to $\mathcal{S}$, it is never
removed. On the other hand, if some cheap subgraph $H$ connects all Steiner vertices of $\mathcal{S}$ to
terminals, then adding $H$ to any Steiner tree gives us a tree that spans $V(\mathcal{S})$, i.e., we have
so far lost at most $c(H)$ in the final answer. This leads to the concept of the loss of a Steiner tree,
which was first introduced by Karpinski and Zelikovsky in [22].

Definition 8. Let $G' = (V',E')$ be a subgraph of $G$. The loss $L(G')$ is a minimum-cost set
$E'' \subseteq E'$ such that every connected component of $(V',E'')$ contains a terminal. Let $l(G')$
denote the cost of $L(G')$.

See Figure 2 for an example of the loss of a graph. The above discussion amounts to saying that
$\min\{\mathrm{mst}(\mathcal{S}') \mid \mathcal{S}' \supseteq \mathcal{S}\} \le \mathrm{opt}_r + l(\mathcal{S})$.
Consequently, our selection function $f_i$ in step 6 of the algorithm should try to keep the loss small.
The following fact holds because full components in $\mathcal{K}_r$ meet only at terminals.

Fact 9. If $\mathcal{S} \subseteq \mathcal{K}_r$, then
$L(\mathcal{S}) = \bigcup_{K \in \mathcal{S}} L(K)$ and so
$l(\mathcal{S}) = \sum_{K \in \mathcal{S}} l(K)$.
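
One standard way to compute a loss, sketched below, is to contract all terminals into a single super-node and run Kruskal's algorithm on the contracted graph: the chosen edges connect every Steiner vertex to some terminal at minimum total cost. The function name `loss`, the vertex labels, and the edge costs are illustrative assumptions; every Steiner vertex is assumed to be able to reach a terminal.

```python
def loss(vertices, edges, terminals):
    """Return (edge set, cost) of a minimum-cost loss.

    edges: list of (cost, u, v) with non-negative costs.
    terminals: non-empty set of terminals, all contained in vertices.
    All terminals start in one union-find class, which contracts them into a
    single super-node; Kruskal then attaches every Steiner vertex to it.
    """
    parent = {v: v for v in vertices}
    anchor = next(iter(terminals))
    for t in terminals:
        parent[t] = anchor

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    chosen = []
    for cost, u, v in sorted(edges, key=lambda e: e[0]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            chosen.append((cost, u, v))
    return chosen, sum(cost for cost, _, _ in chosen)


# Full component 1: a star with Steiner center "s".
star = [(1, "a", "s"), (2, "b", "s"), (3, "c", "s")]
# Full component 2: Steiner vertices "u1", "u2" forming a path.
path = [(1, "u1", "u2"), (5, "u1", "a"), (4, "u1", "b"), (2, "u2", "c"), (3, "u2", "d")]

_, l_star = loss(["a", "b", "c", "s"], star, {"a", "b", "c"})
_, l_path = loss(["a", "b", "c", "d", "u1", "u2"], path, {"a", "b", "c", "d"})
_, l_both = loss(["a", "b", "c", "d", "s", "u1", "u2"], star + path, {"a", "b", "c", "d"})
```

The two components share only terminals, and the loss of their union is the sum of the individual losses (1 + 3 = 4), as Fact 9 states.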

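For a set $\mathcal{S}$ of full components, where $y$ is the dual solution returned by
$\mathrm{MST}(\mathcal{S})$, define

\[
\overline{\mathrm{mst}}(\mathcal{S}) := \sum_{\pi \in \Pi^{\mathcal{S}}} (\bar{r}(\pi) - 1)\, y_\pi. \qquad (9)
\]

If $y$ is feasible for $(D^{\mathcal{S}}_{ST})$ then by weak LP duality,
$\overline{\mathrm{mst}}(\mathcal{S})$ provides a lower bound on $\mathrm{opt}_r$. If $y$ is infeasible for
$(D^{\mathcal{S}}_{ST})$, then which full component should we add? Robins and Zelikovsky propose
minimizing the ratio of the change in upper bound to the change in potential lower bound (9). Their
selection function $f_i$ is defined by

\[
f_i(K) := \frac{l(K)}{\overline{\mathrm{mst}}(\mathcal{S}^i) - \overline{\mathrm{mst}}(\mathcal{S}^i \cup \{K\})}
= \frac{l(\mathcal{S}^i \cup \{K\}) - l(\mathcal{S}^i)}{\overline{\mathrm{mst}}(\mathcal{S}^i) - \overline{\mathrm{mst}}(\mathcal{S}^i \cup \{K\})}, \qquad (10)
\]

where the equality uses Fact 9.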



4 Analysis 

Fix an optimum $r$-Steiner tree $T^*$. There are several steps in proving the performance guarantee of
Robins and Zelikovsky's algorithm, and they are encapsulated in the following result, whose complete proof
appears in Section 6.

Lemma 10. The cost of the tree $T^p$ returned by Algorithm 1 is at most

\[
\mathrm{opt}_r + l(T^*) \cdot \ln\left(1 + \frac{\overline{\mathrm{mst}}(G[R],c) - \mathrm{opt}_r}{l(T^*)}\right).
\]

The main observation in the proof of the above lemma can be summarized as follows: from the discussion
in Section 2, we know that the tree $T^p$ returned by Algorithm 1 has cost

\[
\mathrm{mst}(\mathcal{S}^p) = \sum_{\pi \in \Pi^{\mathcal{S}^p}} (r(\pi) - 1)\, y^p_\pi
\]

and the corresponding lower bound on $\mathrm{opt}_r$ returned by the algorithm is

\[
\overline{\mathrm{mst}}(\mathcal{S}^p) = \sum_{\pi \in \Pi^{\mathcal{S}^p}} (\bar{r}(\pi) - 1)\, y^p_\pi.
\]

We know that $\overline{\mathrm{mst}}(\mathcal{S}^p) \le \mathrm{opt}_r$, but how large is the difference
between $\mathrm{mst}(\mathcal{S}^p)$ and $\overline{\mathrm{mst}}(\mathcal{S}^p)$? We show that the
difference

\[
\sum_{\pi \in \Pi^{\mathcal{S}^p}} (r(\pi) - \bar{r}(\pi))\, y^p_\pi
\]

is exactly equal to the loss $l(T^p)$ of tree $T^p$. We then bound the loss of each selected full component
$K^i$, and putting everything together finally yields Lemma 10.

The following lemma states the performance guarantee of Moore's minimum-spanning tree heuristic as 
a function of the optimum loss and the maximum cardinality b of any Steiner neighborhood in G. 

Lemma 11. Fix an arbitrary optimum $r$-restricted Steiner tree $T^*$. Given an undirected,
$b$-quasi-bipartite graph $G = (V,E)$, a set of terminals $R \subseteq V$, and non-negative costs $c_e$ for
all $e \in E$, we have

\[
\mathrm{mst}(G[R],c) \le 2\,\mathrm{opt}_r - \frac{2}{b}\, l(T^*)
\]

for any $b \ge 1$.

Proof. Recall that $\mathcal{K}_r(T^*)$ is the set of full components of tree $T^*$. Now consider a full
component $K \in \mathcal{K}_r(T^*)$. We will show that there is a minimum-cost spanning tree of $G[K]$
whose cost is at most $2c_K - \frac{2}{b}\, l(K)$. By repeating this argument for all full components
$K \in \mathcal{K}_r(T^*)$, adding the resulting bounds, and applying Fact 9, we obtain the lemma.

For terminals $r,s \in K$, let $P_{rs}$ denote the unique $r,s$-path in $K$. Pick $u,v \in K$ such that
$c(P_{uv})$ is maximal. Define the diameter $\Delta(K) := c(P_{uv})$. Do a depth-first search traversal of
$K$ starting in $u$ and ending in $v$. The resulting walk in $K$ traverses each edge not on $P_{uv}$ twice
while each edge on $P_{uv}$ is traversed once. Hence the walk has cost $2c_K - \Delta(K)$. Using standard
short-cutting arguments it follows that the minimum-cost spanning tree of $G[K]$ has cost at most

\[
2c_K - \Delta(K) \qquad (11)
\]

as well.

Each Steiner vertex $s \in V(K) \setminus R$ can connect to some terminal $v \in K$ at cost at most
$\frac{\Delta(K)}{2}$. Hence, the cost $l(K)$ of the loss of $K$ is at most $b\,\frac{\Delta(K)}{2}$. In
other words we have $\Delta(K) \ge \frac{2}{b}\, l(K)$. Plugging this into (11) yields the lemma. □

For small values of b we can obtain additional improvements via case analysis. 

Lemma 12. Suppose $b \in \{3,4\}$. Fix an arbitrary optimum $r$-restricted Steiner tree $T^*$. Given an
undirected, $b$-quasi-bipartite graph $G = (V,E)$, a set of terminals $R \subseteq V$, and non-negative
costs $c_e$ for all $e \in E$, we have

\[
\mathrm{mst}(G[R],c) \le 2\,\mathrm{opt}_r - l(T^*).
\]

Figure 3: The figure shows the two types of full components when $b \le 4$. On the left (i) is a full
component where the Steiner nodes form a path, and on the right (ii) is a full component where the Steiner
nodes form a star with 3 tips.

Proof. As in the proof of Lemma 11 it suffices to prove that, for each full component
$K \in \mathcal{K}_r(T^*)$, there is a minimum-cost spanning tree of $G[K]$ whose cost is at most
$2c_K - l(K)$, for then we can add the bound over all such $K$ to get the desired result. For terminals
$r,s \in K$, let $P_{rs}$ again denote the unique $r,s$-path in $K$.

Notice that the Steiner nodes (there are at most $b$ of them) in the full component $K$ either form a
path, or else there are 4 of them and they form a star.

Case 1: the Steiner nodes in $K$ form a path. Let $x$ and $y$ be the Steiner nodes on the ends of this
path. Let $u$ (resp. $v$) be any terminal neighbour of $x$ (resp. $y$); see Figure 3(i) for an example.
Perform a depth-first search in $K$ starting from $u$ and ending at $v$; the cost of this search is
$2c_K - c(P_{uv})$. By standard short-cutting arguments it follows that $2c_K - c(P_{uv})$ is an upper
bound on $\mathrm{mst}(G[K])$. On the other hand, since $P_{uv} \setminus \{ux\}$ is a candidate for the
loss of $K$, we know that $l(K) \le c(P_{uv} \setminus \{ux\}) \le c(P_{uv})$. Therefore we obtain

\[
\mathrm{mst}(G[K]) \le 2c_K - c(P_{uv}) \le 2c_K - l(K). \qquad (12)
\]

Case 2: the Steiner nodes in $K$ form a star. Let the tips of the star be $x,y,z$ and let $t,u,v$ be any
terminal neighbours of $x,y,z$ respectively; see Figure 3(ii) for an example. Without loss of generality,
we may assume that $c_{xt} \le c_{yu} \le c_{zv}$. As before, a depth-first search in $K$ starting from $u$
and ending at $v$ has cost $2c_K - c(P_{uv})$ and this is an upper bound on $\mathrm{mst}(G[K])$. On the
other hand, $P_{uv} \setminus \{yu\} \cup \{xt\}$ is a candidate for the loss of $K$ and so
$l(K) \le c(P_{uv}) - c_{yu} + c_{xt} \le c(P_{uv})$. We hence obtain Equation (12) as in the previous
case. □

We are ready to prove our main theorem. We restate it using the notation introduced in the last two
sections.

Theorem 1. Given an undirected, $b$-quasi-bipartite graph $G = (V,E)$, terminals $R \subseteq V$, and a fixed
constant $r \ge 2$, Algorithm 1 returns a feasible Steiner tree $T^p$ with

\[
c(T^p) \le
\begin{cases}
1.279 \cdot \mathrm{opt}, & b = 1, \\
\left(1 + \frac{1}{e}\right) \cdot \mathrm{opt}_r, & b \in \{2,3,4\}, \\
\left(1 + \frac{1}{2}\ln\left(3 - \frac{2}{b}\right)\right) \cdot \mathrm{opt}_r, & b \ge 5.
\end{cases}
\]

Proof. Using Lemma 10 we see that

\[
c(T^p) \le \mathrm{opt}_r + l(T^*) \cdot \ln\left(1 + \frac{\overline{\mathrm{mst}}(G[R],c) - \mathrm{opt}_r}{l(T^*)}\right)
= \mathrm{opt}_r + l(T^*) \cdot \ln\left(1 + \frac{\mathrm{mst}(G[R],c) - \mathrm{opt}_r}{l(T^*)}\right). \qquad (13)
\]

The equality in (13) holds because $G[R]$ has no Steiner vertices, so every partition of $R$ is a Steiner
partition. Applying the bound on $\mathrm{mst}(G[R],c)$ from Lemma 11 yields

\[
c(T^p) \le \mathrm{opt}_r \cdot \left[1 + \frac{l(T^*)}{\mathrm{opt}_r} \cdot \ln\left(1 - \frac{2}{b} + \frac{\mathrm{opt}_r}{l(T^*)}\right)\right]. \qquad (14)
\]

Karpinski and Zelikovsky [22] show that $l(T^*) \le \frac{1}{2}\mathrm{opt}_r$. We can therefore obtain an
upper bound on the right-hand side of (14) by bounding the maximum value of the function
$x \ln(1 - 2/b + 1/x)$ for $x \in [0, 1/2]$. We branch into cases:

$b = 1$: The maximum of $x \ln(1/x - 1)$ for $x \in [0,1/2]$ is attained for $x \approx 0.2178$. Hence,
$x \ln(1/x - 1) \le 0.279$ for $x \in [0,1/2]$.

$b = 2$: The maximum of $x \ln(1/x)$ is attained for $x = 1/e$ and hence $x \ln(1/x) \le 1/e$ for
$x \in [0, 1/2]$.

$b \in \{3,4\}$: We use Equation (13) together with Lemma 12 in place of Lemma 11; the subsequent analysis
and result are the same as in the previous case.

$b \ge 5$: The function $x \ln(1 - 2/b + 1/x)$ is increasing in $x$ and its maximum is attained for
$x = 1/2$. Thus, $x \ln(1 - 2/b + 1/x) \le \frac{1}{2}\ln(3 - 2/b)$ for $x \in [0, 1/2]$.

The four cases above conclude the proof of the theorem. □
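
The constants in the case analysis can be checked numerically. The following sketch runs a plain grid search over $(0, 1/2]$; the function names `g` and `grid_max` and the grid resolution are choices made here for illustration, and a grid search only approximates the true maximizer.

```python
import math


def g(x, b):
    """The function x * ln(1 - 2/b + 1/x) bounded in the proof of Theorem 1."""
    return x * math.log(1 - 2 / b + 1 / x)


def grid_max(b, steps=100_000):
    """Grid search for the maximum of g(., b) on (0, 1/2]; returns (argmax, max)."""
    xs = (0.5 * k / steps for k in range(1, steps + 1))
    best = max(xs, key=lambda x: g(x, b))
    return best, g(best, b)


x1, m1 = grid_max(1)   # quasi-bipartite case
x2, m2 = grid_max(2)   # same function as for b in {3, 4} after applying Lemma 12
x5, m5 = grid_max(5)   # first value of b covered by the last case
```

For $b = 1$ the search returns a maximizer near 0.2178 with value just below 0.279; for $b = 2$ both the maximizer and the maximum are close to $1/e$; and for $b = 5$ the maximum sits at the right endpoint $x = 1/2$, where it equals $\frac{1}{2}\ln(3 - 2/5)$.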



5 Properties of (P ST ) 

In this section, we first prove that the linear program $(P^{\mathcal{S}}_{ST})$ is gradually weakened as
the algorithm progresses (i.e., as more full components are added to $\mathcal{S}$). Then we describe
bounds on the integrality gap of the new LP, and its strength compared to other LPs for the Steiner tree
problem.

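Lemma 13. If $\mathcal{S} \subseteq \mathcal{S}'$, then the integrality gap of $(P^{\mathcal{S}}_{ST})$ is at most the integrality gap of $(P^{\mathcal{S}'}_{ST})$.

Proof. We consider only the case where $\mathcal{S}' = \mathcal{S} \cup \{J\}$ for some full component $J$; the general case then follows by induction on $|\mathcal{S}' \setminus \mathcal{S}|$.

Let $x$ be any feasible primal point for $(P^{\mathcal{S}}_{ST})$ and define the extension $x'$ of $x$ to be a primal point of $(P^{\mathcal{S}'}_{ST})$, with $x'_e = x_J$ for all $e \in E(J)$ and $x'_Z = x_Z$ for all $Z \in (\mathcal{K}_r \setminus \mathcal{S}') \cup E(\mathcal{S})$. We claim that $x'$ is feasible for $(P^{\mathcal{S}'}_{ST})$. Since $x$ and $x'$ have the same objective value, this will prove Lemma 13.

It is clear that $x'$ satisfies constraints (5), so now let us show that $x'$ satisfies the partition inequality (4) in $(P^{\mathcal{S}'}_{ST})$. Fix an arbitrary partition $\pi'$ of $V(\mathcal{S}')$, and let $\pi$ be the restriction of $\pi'$ to $V(\mathcal{S})$. We get

\[
\sum_{e \in E_{\pi'}(\mathcal{S}')} x'_e + \sum_{K \in \mathcal{K}_r \setminus \mathcal{S}'} \mathrm{rc}^{\pi'}_K x'_K
= \Bigl( \sum_{e \in E_{\pi}(\mathcal{S})} x_e + \sum_{K \in \mathcal{K}_r \setminus \mathcal{S}} \mathrm{rc}^{\pi}_K x_K \Bigr) + |E_{\pi'} \cap E(J)|\, x_J - \mathrm{rc}^{\pi}_J x_J. \tag{15}
\]

Now $J$ spans at least $\mathrm{rc}^{\pi}_J + 1$ parts of $\pi'$, and it follows that $|E_{\pi'} \cap E(J)| \ge \mathrm{rc}^{\pi}_J$. Hence, using Equation (15), the fact that $x$ satisfies constraint (4) for $\pi$, and the fact that $\bar r(\pi) = \bar r(\pi')$, we have

\[
\sum_{e \in E_{\pi'}(\mathcal{S}')} x'_e + \sum_{K \in \mathcal{K}_r \setminus \mathcal{S}'} \mathrm{rc}^{\pi'}_K x'_K
\ge \sum_{e \in E_{\pi}(\mathcal{S})} x_e + \sum_{K \in \mathcal{K}_r \setminus \mathcal{S}} \mathrm{rc}^{\pi}_K x_K
\ge \bar r(\pi) - 1 = \bar r(\pi') - 1.
\]

So $x'$ satisfies (4) for $\pi'$. □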

Figure 4: Skutella's example, which shows that the bidirected cut formulation and our new formulation both have a gap of at least $\frac{8}{7}$. The shaded edges denote one of the quasi-bipartite full components on 5 terminals.

In 1997, Warme [37] introduced a new linear program for the Steiner tree problem. He observed (as did 
the authors of [30] in the same year) that full components allow a reduction from the Steiner tree problem to 
the spanning-tree-in-hypergraph problem. He also gave an LP relaxation for spanning trees in hypergraphs. 
That LP turns out to be exactly as strong as our own LP; see [24, Corollary 3.19] for a proof. Now, Polzin 
et al. [29] proved that Warme's relaxation is stronger than the bidirected cut relaxation, and Goemans [15] 
proved that the (graph) Steiner partition inequalities are valid for the bidirected cut formulation. Hence, as 
stated previously, using full components as in $(P^{\mathcal{S}}_{ST})$ strengthens the Steiner partition inequalities.

5.1 A lower bound on the integrality gap of (P ST )

Note that when $\mathcal{S} = \binom{R}{2}$, $(P^{\emptyset}_{ST})$ and $(P^{\mathcal{S}}_{ST})$ are equivalent LPs: for each terminal-terminal edge $uv$, the full component variable $x_{\{u,v\}}$ of the former corresponds to the edge variable $x_{uv}$ of the latter. Hence although we consider the simpler LP $(P^{\emptyset}_{ST})$ in this section, the results apply also to the LP used in the first iteration of RZ.

Goemans [1] gave a family of graphs upon which, in the limit, the integrality gap of the bidirected cut relaxation is $\frac{8}{7}$. Interestingly, it can be shown that once these graphs are preprocessed as described in Section 2.3, the gap completely disappears. Here we describe another example, due to Skutella [35]. It shows not only that the gap of the bidirected cut relaxation is at least $\frac{8}{7}$, but that the gap of our new formulation (including preprocessing) is at least $\frac{8}{7}$. The example is quasi-bipartite.

The Fano design is a well-known finite geometry consisting of 7 points and 7 lines, such that every point is on 3 lines, every line contains 3 points, any two lines meet in a unique point, and any two points lie on a unique common line. We construct Skutella's example by creating a bipartite graph, with one side consisting of one node $n_p$ for each point $p$ of the Fano design, and the other side consisting of one node $n_\ell$ for each line $\ell$ of the Fano design. Define $n_p$ and $n_\ell$ to be adjacent in our graph if and only if $p$ does not lie on $\ell$. Then it is easy to see that this graph is 4-regular, and that given any two nodes $n_1, n_2$ from one side, there is a node from the other side that is adjacent to neither $n_1$ nor $n_2$. Let one side be terminals, the other side be Steiner nodes, and then attach one additional terminal to all the Steiner nodes. We illustrate the resulting graph in Figure 4.
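The construction just described can be reproduced in a few lines. This is a minimal sketch under two assumptions that are not taken from the text: a particular labelling of the Fano lines (`FANO_LINES`) and the choice of the point side as the terminal side. The helper names are hypothetical.

```python
from itertools import combinations

# One standard labelling of the Fano plane: points 0..6, seven 3-point lines.
FANO_LINES = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
              (1, 4, 6), (2, 3, 6), (2, 4, 5)]

def build_example():
    """Return (terminals, steiner_nodes, adjacency) for the graph described above.

    Assumption of this sketch: the point side is taken as the terminal side.
    """
    points = [("p", p) for p in range(7)]
    lines = [("l", i) for i in range(7)]
    top = ("top", 0)
    adj = {v: set() for v in points + lines + [top]}
    for p in range(7):
        for i, line in enumerate(FANO_LINES):
            if p not in line:  # adjacent iff p does NOT lie on the line
                adj[("p", p)].add(("l", i))
                adj[("l", i)].add(("p", p))
    for s in lines:  # the extra terminal is attached to every Steiner node
        adj[top].add(s)
        adj[s].add(top)
    return points + [top], lines, adj

def has_common_non_neighbour(side, other_side, adj):
    """True if any two nodes of `side` have a node of `other_side` adjacent to neither."""
    return all(any(u not in adj[w] and v not in adj[w] for w in other_side)
               for u, v in combinations(side, 2))

terminals, steiner, adj = build_example()
point_terminals = [t for t in terminals if t[0] == "p"]
print(sorted(len(adj[p]) for p in point_terminals))  # degrees on the point side
print(sorted(len(adj[s]) - 1 for s in steiner))      # line-side degrees, ignoring the top terminal
print(has_common_non_neighbour(point_terminals, steiner, adj),
      has_common_non_neighbour(steiner, point_terminals, adj))
```

Both regularity and the "adjacent to neither" property follow from the incidence axioms of the Fano design, so any valid labelling of the lines would do.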

Each Steiner node is in a unique 5-terminal quasi-bipartite full component. There are 7 such full components. Denote the family of these 7 full components by $\mathcal{C}$.

Claim 14. Let $x^*_K = \frac{1}{4}$ for each $K \in \mathcal{C}$, and $x^*_K = 0$ otherwise. Then $x^*$ is feasible for $(P^{\emptyset}_{ST})$.

Proof. It is immediate that $x^*$ satisfies constraints (5). It remains only to show that $x^*$ meets constraint (4). Let $\pi$ be an arbitrary partition, with parts $\pi_0, \ldots, \pi_m$ such that $\pi_0$ contains the extra "top" terminal. If we can show that $\sum_K x^*_K \mathrm{rc}^{\pi}_K \ge m$ then we will be done, since $\pi$ was arbitrary. For each $i = 1, \ldots, m$, let $r_i$ be any terminal in $\pi_i$. Note that each $r_i$ lies in exactly 4 full components from $\mathcal{C}$. Furthermore, every full component $K \in \mathcal{C}$ satisfies $\mathrm{rc}^{\pi}_K \ge |K \cap \{r_1, \ldots, r_m\}|$, since that full component meets $\pi_0$ as well as each part $\pi_j$ such that $r_j \in K$. Hence

\[
\sum_K x^*_K \mathrm{rc}^{\pi}_K = \frac{1}{4} \sum_{K \in \mathcal{C}} \mathrm{rc}^{\pi}_K
\ge \frac{1}{4} \sum_{K \in \mathcal{C}} \#\{j : r_j \in K\}
= \frac{1}{4} \sum_{j=1}^{m} \#\{K \in \mathcal{C} : r_j \in K\}
= \frac{1}{4} \cdot m \cdot 4 = m. \qquad □
\]

The objective value of $x^*$ is $\frac{35}{4}$, but the optimal integral solution to the LP is 10, since at least 3 Steiner nodes need to be included. Hence, the gap of our new LP is no better than $\frac{10}{35/4} = \frac{8}{7}$.
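Claim 14 and the resulting gap can be checked by brute force, since there are only 4140 partitions of the 8 terminals. The sketch below makes several assumptions that are not spelled out in this section: all edges have unit cost (implied by the integral optimum of 10), the rank contribution $\mathrm{rc}^{\pi}_K$ is the number of parts met by $K$ minus one (as used in the proof of Claim 14), and the lines are labelled as in `FANO_LINES`. All names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

FANO_LINES = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
              (1, 4, 6), (2, 3, 6), (2, 4, 5)]
TOP = 7  # label of the extra "top" terminal; point terminals are 0..6

# Terminal set of the full component around the Steiner node of each line:
# the 4 points off the line, plus the top terminal.
COMPONENTS = [frozenset(set(range(7)) - set(line)) | {TOP} for line in FANO_LINES]

def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):  # join an existing block
            yield part[:i] + [part[i] + [first]] + part[i + 1:]
        yield part + [[first]]      # or start a new block

def rank_contribution(component, partition):
    """Number of parts met by the component, minus one (assumed definition)."""
    return sum(1 for block in partition if component & set(block)) - 1

def claim14_holds():
    """Check sum_K x*_K rc_K >= r(pi) - 1 for every partition pi of the terminals."""
    x_star = Fraction(1, 4)
    return all(
        sum(x_star * rank_contribution(K, part) for K in COMPONENTS) >= len(part) - 1
        for part in set_partitions(list(range(8))))

def min_steiner_nodes():
    """Fewest Steiner nodes whose neighbourhoods cover all point terminals."""
    for k in range(1, 8):
        for chosen in combinations(FANO_LINES, k):
            if all(any(p not in line for line in chosen) for p in range(7)):
                return k

lp_value = Fraction(1, 4) * 5 * len(COMPONENTS)  # 7 components, 5 unit-cost edges each
integral_opt = 8 + min_steiner_nodes() - 1       # edges of a tree on 8 terminals plus k Steiner nodes
print(claim14_holds(), lp_value, integral_opt, Fraction(integral_opt) / lp_value)
```

Because every Steiner node is adjacent to the top terminal, a set of Steiner nodes yields a connected subgraph exactly when every point terminal has a chosen neighbour, which is what `min_steiner_nodes` tests.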

5.2 A gap upper bound for b-quasi-bipartite instances

In [32] Rajagopalan and Vazirani show that the bidirected cut relaxation has a gap of at most $\frac{3}{2}$, if the graph is quasi-bipartite. Since $(P^{\emptyset}_{ST})$ is stronger than the bidirected cut relaxation, its gap is also at most $\frac{3}{2}$ for such graphs. We are able to generalize this result as follows.

Theorem 2. On $b$-quasi-bipartite graphs, $(P^{\emptyset}_{ST})$ has an integrality gap between $\frac{8}{7}$ and $\frac{2b+1}{b+1}$ in the worst case.

Proof. The lower bound comes from Section 5.1. We assume $G$ is $b$-quasi-bipartite, we let $T^*$ be an optimal Steiner tree, and we let $\mathcal{S}^*$ be its set of full components. Since $T^*$ is a minimum spanning tree for $\mathcal{S}^*$, there is a corresponding feasible dual $y$ for $(D_{SP})$. When we convert $y$ to a dual for $(D^{\mathcal{S}^*}_{ST})$, we claim that $y$ is feasible: indeed, by Lemma 7 a violated full component could be used to improve the solution, but $T^*$ is already optimal. The next lemma is the cornerstone of our proof.

Lemma 15. Let $\pi$ be a partition of $V(\mathcal{S}^*)$ with $y_\pi > 0$. Then $(\bar r(\pi) - 1) \ge \frac{b+1}{2b+1}(r(\pi) - 1)$.

Proof. For each part $\pi_i$ of $\pi$, let us identify all of the nodes of $\pi_i$ into a single pseudonode $v_i$. We may assume by Theorem 3 that each $T^*[\pi_i]$ is connected, hence this identification process yields a tree $T'$. Let us say that $v_i$ is Steiner if and only if all nodes of $\pi_i$ are Steiner. Note that $T'$ has $r(\pi)$ pseudonodes and $r(\pi) - \bar r(\pi)$ of these pseudonodes are Steiner. The full components of $T'$ are defined analogously to the full components of a Steiner tree.

Consider any full component $K'$ of $T'$ and let $K'$ contain exactly $s$ Steiner pseudonodes. It is straightforward to see that $s \le b$. Each Steiner pseudonode in $K'$ has degree at least 3 by Assumptions A1 and A2, and at most $s - 1$ edges of $K'$ join Steiner vertices to other Steiner vertices. Hence $K'$ has at least $3s - (s - 1) = 2s + 1$ edges, and so

\[
|E(K')| \ge \frac{2s+1}{s} \cdot s \ge \frac{2b+1}{b} \cdot s.
\]

Now summing over all full components $K'$, we obtain

\[
|E(T')| \ge \frac{2b+1}{b} \cdot \#\{\text{Steiner pseudonodes of } T'\}.
\]

But $|E(T')| = r(\pi) - 1$ and $T'$ has $r(\pi) - \bar r(\pi)$ Steiner pseudonodes, therefore

\[
r(\pi) - 1 \ge \frac{2b+1}{b}\bigl((r(\pi) - 1) - (\bar r(\pi) - 1)\bigr)
\;\Longrightarrow\;
\frac{2b+1}{b}(\bar r(\pi) - 1) \ge \frac{b+1}{b}(r(\pi) - 1).
\]

This proves what we wanted to show. □
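It follows that the objective value of $y$ in $(D^{\mathcal{S}^*}_{ST})$ is

\[
\sum_{\pi} (\bar r(\pi) - 1)\, y_\pi \ge \sum_{\pi} \frac{b+1}{2b+1}(r(\pi) - 1)\, y_\pi = \frac{b+1}{2b+1}\, c(T^*),
\]

and since $T^*$ is an optimum integer solution of $(P^{\mathcal{S}^*}_{ST})$, it follows that the integrality gap of $(P^{\mathcal{S}^*}_{ST})$ is at most $\frac{2b+1}{b+1}$.

Then, finally, by applying Lemma 13 to $(P^{\emptyset}_{ST})$ and $(P^{\mathcal{S}^*}_{ST})$ we obtain Theorem 2. □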
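To get a feel for the window that Theorem 2 leaves open, the short sketch below tabulates the upper bound $\frac{2b+1}{b+1}$ next to the lower bound $\frac{8}{7}$ using exact rationals. The function name `gap_upper_bound` is an assumption made for illustration.

```python
from fractions import Fraction

LOWER = Fraction(8, 7)  # lower bound from Section 5.1

def gap_upper_bound(b: int) -> Fraction:
    """Upper bound (2b+1)/(b+1) on the integrality gap for b-quasi-bipartite graphs."""
    return Fraction(2 * b + 1, b + 1)

for b in (1, 2, 3, 5, 10, 100):
    ub = gap_upper_bound(b)
    print(f"b={b}: {LOWER} <= gap <= {ub} (~{float(ub):.4f})")
```

For $b = 1$ the bound equals $\frac{3}{2}$, matching the quasi-bipartite bound of Rajagopalan and Vazirani, and it increases towards 2 as $b$ grows.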

6 Proof of Lemma 10



In this section we present a proof of Lemma 10. The methodology follows that proposed by Gröpl et al. [19]. In fact, many of the proofs below essentially correspond to those presented in [19] with two exceptions: we correct a small error near the end, and we present a new proof of the ubiquitous contraction lemma.

We remind the reader of our standing assumption that $\mathcal{S} \supseteq \binom{R}{2}$. We first relate the cost of a minimum-cost spanning tree of $\mathcal{S}$, for some set $\mathcal{S}$ of full components, to the (potential) lower bound $\overline{\mathrm{mst}}(\mathcal{S})$ on $\mathrm{opt}_r$ that it provides. For ease of presentation in the analysis, we will assume from now on that the costs of all edges in $E$ are pairwise different. This assumption is easily seen to be w.l.o.g. (e.g., one could define an order on the edges in $E$ and use it to break ties). We omit the proof of the following easy fact.

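Fact 16. If $T$ is a minimum-cost spanning tree of $\mathcal{S}$ then $l(T) = l(\mathcal{S})$.

Lemma 17. For any set $\mathcal{S} \subseteq \mathcal{K}_r$ of full components,

\[
\mathrm{mst}(\mathcal{S}) = \overline{\mathrm{mst}}(\mathcal{S}) + l(\mathcal{S}).
\]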

Proof. We use the notation from Section 2: $\tau^*$ is the finishing time of Kruskal's algorithm, $G_\tau = (V, E_\tau)$ is the forest maintained at time $\tau$, and $\pi_\tau$ is the partition induced by the connected components of $G_\tau$. Let $(T,y)$ denote the tree-dual pair returned by MST.

From Theorem 3 we know that there exists a feasible dual solution $y$ to $(P_{SP})$ for graph $\mathcal{S}$ such that

\[
c(T) = \sum_{\pi} (r(\pi) - 1)\, y_\pi = \int_0^{\tau^*} (r(\pi_\tau) - 1)\, d\tau.
\]

In the following let $\mathcal{R}_\tau$ be the set of those connected components of $E_\tau$ that contain terminal vertices.

Claim 18. For all $0 \le \tau \le \tau^*$, each connected component of $E_\tau \cup L(T)$ contains exactly one connected component from $\mathcal{R}_\tau$.

Proof. Let $u$ and $v$ be terminals in distinct connected components of $G_\tau$ and let $P_{uv}$ be the unique $u,v$-path in $T$. Assume for the sake of contradiction that $P_{uv}$ is contained in $E_\tau \cup L(T)$.

Let $e$ be the unique edge of maximum cost on path $P_{uv}$. Recall from Section 2 that Kruskal's algorithm adds edges to the partial spanning tree in order of non-decreasing cost. Thus, edge $e$ is added last among all edges on $P_{uv}$. As $u$ and $v$ are in different connected components of $G_\tau$, it therefore follows that $e \notin E_\tau$. The loss of $T$ is a minimum-cost forest in $T$ that connects all Steiner vertices to terminals. Thus, the unique edge of maximum cost on $P_{uv}$ cannot be in $L(T)$.

It follows that $e \notin E_\tau \cup L(T)$ and this contradicts our assumption that $P_{uv} \subseteq E_\tau \cup L(T)$. □

For each time $0 \le \tau \le \tau^*$, define $\bar\pi_\tau$ as the Steiner partition corresponding to the connected components of $G_\tau \cup L(T)$. From Theorem 3 we know that

\[
l(T) = \sum_{e \in L(T)} c_e = \sum_{e \in L(T)} \sum_{\pi : e \in E_\pi} y_\pi = \int_0^{\tau^*} |E_{\pi_\tau} \cap L(T)|\, d\tau,
\]

where, as before, $E_{\pi_\tau}$ is the set of edges in $E$ that have endpoints in different parts of $\pi_\tau$.

The number of edges in $E_{\pi_\tau} \cap L(T)$ is exactly the rank-difference between $\pi_\tau$ and $\bar\pi_\tau$ and hence

\[
l(T) = \int_0^{\tau^*} \bigl(r(\pi_\tau) - r(\bar\pi_\tau)\bigr)\, d\tau.
\]

Claim 18 implies that $r(\bar\pi_\tau) = \bar r(\pi_\tau)$ for all $0 \le \tau \le \tau^*$ and hence

\[
\overline{\mathrm{mst}}(\mathcal{S}) + l(T)
= \int_0^{\tau^*} (\bar r(\pi_\tau) - 1)\, d\tau + \int_0^{\tau^*} \bigl(r(\pi_\tau) - \bar r(\pi_\tau)\bigr)\, d\tau
= \int_0^{\tau^*} (r(\pi_\tau) - 1)\, d\tau = c(T).
\]

Applying Fact 16 and the equality $c(T) = \mathrm{mst}(\mathcal{S})$, we are done. □

We obtain the following immediate corollary: 

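Corollary 19. In iteration $i$ of Algorithm 1, adding full component $K \in \mathcal{K}_r$ to $\mathcal{S}^i$ reduces the cost of $\mathrm{mst}(\mathcal{S}^i)$ if and only if $f_i(K) < 1$.

Proof. By applying Lemma 17 we see that

\[
\mathrm{mst}(\mathcal{S}^i) - \mathrm{mst}(\mathcal{S}^i \cup \{K\})
= \overline{\mathrm{mst}}(\mathcal{S}^i) + l(\mathcal{S}^i) - \overline{\mathrm{mst}}(\mathcal{S}^i \cup \{K\}) - l(\mathcal{S}^i \cup \{K\}).
\]

Whereas the left-hand side is positive iff adding $K$ to $\mathcal{S}^i$ causes a reduction in mst, the right-hand side is positive iff $f_i(K) < 1$, due to the definition of $f_i$. □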

Using Lemma 7 and Corollary 19, we obtain the following. 

Corollary 20. For all $0 \le i < p$, $f_i(K^i) < 1$.

Fix an optimum r-Steiner tree T*. The next two lemmas give bounds that are needed to analyze RZ's 
greedy strategy. Informally, the first says that mst is non-increasing, while the second says that mst is 
submodular. 

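Lemma 21. If $\mathcal{S} \subseteq \mathcal{S}' \subseteq \mathcal{K}_r$, then $\overline{\mathrm{mst}}(\mathcal{S}') \le \overline{\mathrm{mst}}(\mathcal{S})$.

Proof. Using Lemma 17 and Fact 9 we see

\[
\overline{\mathrm{mst}}(\mathcal{S}) - \overline{\mathrm{mst}}(\mathcal{S}') = \mathrm{mst}(\mathcal{S}) + l(\mathcal{S}' \setminus \mathcal{S}) - \mathrm{mst}(\mathcal{S}').
\]

However, the right-hand side of the above equation is non-negative, as $\mathrm{MST}(\mathcal{S}) \cup L(\mathcal{S}' \setminus \mathcal{S})$ is a spanning tree of $\mathcal{S}'$. Lemma 21 then follows. □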

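Lemma 22 (Contraction Lemma). Let $\mathcal{S}^0, \mathcal{S}^1, \mathcal{S}^2 \subseteq \mathcal{K}_r$ be disjoint collections of full components with $\binom{R}{2} \subseteq \mathcal{S}^0$. Then

\[
\overline{\mathrm{mst}}(\mathcal{S}^0) - \overline{\mathrm{mst}}(\mathcal{S}^0 \cup \mathcal{S}^2) \ge \overline{\mathrm{mst}}(\mathcal{S}^0 \cup \mathcal{S}^1) - \overline{\mathrm{mst}}(\mathcal{S}^0 \cup \mathcal{S}^1 \cup \mathcal{S}^2).
\]

Proof. The statement to be proved is equivalent to

\[
\mathrm{mst}(\mathcal{S}^0) - \mathrm{mst}(\mathcal{S}^0 \cup \mathcal{S}^2) \ge \mathrm{mst}(\mathcal{S}^0 \cup \mathcal{S}^1) - \mathrm{mst}(\mathcal{S}^0 \cup \mathcal{S}^1 \cup \mathcal{S}^2), \tag{16}
\]

due to Lemma 17 and Fact 9. For a graph $H$, define the rank $r(H)$ of $H$ as the number of edges in a maximal forest of $H$:

\[
r(H) = |V(H)| - \#\,\text{connected components of } H.
\]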

For a graph $H$, let $H_{\le x}$ denote the subgraph of $H$ consisting of those edges of weight at most $x$. By considering Kruskal's algorithm, for any graph $H$ having nonnegative edge costs, we see that

\[
\mathrm{mst}(H) = \sum_{i=1}^{r(H)} \min\{x \mid r(H_{\le x}) \ge i\} = \int_0^{\infty} \bigl(r(H) - r(H_{\le x})\bigr)\, dx. \tag{17}
\]

Note that the integral is proper since the integrand is 0 for $x$ larger than $\max\{c_e : e \in E(H)\}$.
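Equation (17) is easy to confirm on small graphs, because the integrand is a step function that only changes at edge weights. The following minimal sketch compares Kruskal's algorithm with an exact evaluation of the integral; the helper names and the two example graphs are assumptions made for illustration, and edges are given as `(u, v, weight)` triples on vertices `0..n-1`.

```python
def find(parent, v):
    """Union-find lookup with path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

def rank(n, edges):
    """r(H) = |V(H)| - #components, i.e. the size of a maximal forest."""
    parent = list(range(n))
    joined = 0
    for u, v, _ in edges:
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
            joined += 1
    return joined

def mst_kruskal(n, edges):
    """Cost of a minimum spanning forest via Kruskal's algorithm."""
    parent = list(range(n))
    total = 0
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
            total += w
    return total

def mst_by_rank_integral(n, edges):
    """Evaluate the integral in (17) exactly, one step of the integrand at a time."""
    r_total = rank(n, edges)
    total, prev = 0, 0
    for wt in sorted({e[2] for e in edges}):
        below = [e for e in edges if e[2] < wt]  # edges of H_{<=x} for x in [prev, wt)
        total += (wt - prev) * (r_total - rank(n, below))
        prev = wt
    return total

EDGES_A = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4)]
EDGES_B = [(0, 1, 2), (1, 2, 2), (0, 2, 2), (2, 3, 5), (3, 4, 1), (0, 4, 5)]
print(mst_kruskal(4, EDGES_A), mst_by_rank_integral(4, EDGES_A))
print(mst_kruskal(5, EDGES_B), mst_by_rank_integral(5, EDGES_B))
```

The second graph contains ties, which shows that the identity does not depend on the assumption of pairwise different edge costs.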

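Here is the crux: $r$ is the rank function for a (graphic) matroid and is therefore submodular over the addition of disjoint edge sets. Since the $\mathcal{S}^i_{\le x}$ are pairwise disjoint, for every $x$, this submodularity implies that

\[
-r(\mathcal{S}^0_{\le x}) + r(\mathcal{S}^0_{\le x} \cup \mathcal{S}^2_{\le x}) \ge -r(\mathcal{S}^0_{\le x} \cup \mathcal{S}^1_{\le x}) + r(\mathcal{S}^0_{\le x} \cup \mathcal{S}^1_{\le x} \cup \mathcal{S}^2_{\le x}). \tag{18}
\]

Notice also that

\[
r(\mathcal{S}^0) - r(\mathcal{S}^0 \cup \mathcal{S}^2) = r(\mathcal{S}^0 \cup \mathcal{S}^1) - r(\mathcal{S}^0 \cup \mathcal{S}^1 \cup \mathcal{S}^2) \tag{19}
\]

since both sides are equal to the number of Steiner vertices in $\mathcal{S}^2$, times $-1$.

Finally, we add Equation (18) to Equation (19) and integrate along $x$. Since $(\mathcal{S}^0 \cup \mathcal{S}^2)_{\le x} = \mathcal{S}^0_{\le x} \cup \mathcal{S}^2_{\le x}$ etc., we get

\[
\begin{aligned}
&\int_0^{\infty} \bigl(r(\mathcal{S}^0) - r(\mathcal{S}^0_{\le x})\bigr)\, dx - \int_0^{\infty} \bigl(r(\mathcal{S}^0 \cup \mathcal{S}^2) - r((\mathcal{S}^0 \cup \mathcal{S}^2)_{\le x})\bigr)\, dx \\
&\quad \ge \int_0^{\infty} \bigl(r(\mathcal{S}^0 \cup \mathcal{S}^1) - r((\mathcal{S}^0 \cup \mathcal{S}^1)_{\le x})\bigr)\, dx - \int_0^{\infty} \bigl(r(\mathcal{S}^0 \cup \mathcal{S}^1 \cup \mathcal{S}^2) - r((\mathcal{S}^0 \cup \mathcal{S}^1 \cup \mathcal{S}^2)_{\le x})\bigr)\, dx.
\end{aligned}
\]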

But using Equation (17), this gives precisely Equation (16). □ 
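Inequality (16) can be spot-checked on random instances. The sketch below uses a simplified setting that is an assumption of this example rather than the setting of the lemma: all three edge sets live on one fixed vertex set and the base set `a` is a spanning tree, so the ranks in Equation (19) agree trivially and only the submodularity step (18) is exercised. Function names, sizes and weight ranges are hypothetical.

```python
import random

def find(parent, v):
    """Union-find lookup with path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

def mst_cost(n, edges):
    """Kruskal's algorithm on a (multi)graph given as (u, v, weight) triples."""
    parent = list(range(n))
    total = 0
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
            total += w
    return total

def random_instance(rng, n=8, extra=6, max_w=20):
    """Return (n, a, b, c): a is a random spanning tree, b and c are extra edge sets."""
    a = [(rng.randrange(v), v, rng.randint(1, max_w)) for v in range(1, n)]

    def rand_edges(k):
        out = []
        for _ in range(k):
            u, v = rng.sample(range(n), 2)
            out.append((u, v, rng.randint(1, max_w)))
        return out

    return n, a, rand_edges(extra), rand_edges(extra)

def contraction_inequality_holds(n, a, b, c):
    """Check mst(a) - mst(a+c) >= mst(a+b) - mst(a+b+c), the analogue of (16)."""
    lhs = mst_cost(n, a) - mst_cost(n, a + c)
    rhs = mst_cost(n, a + b) - mst_cost(n, a + b + c)
    return lhs >= rhs

def check_many(trials=300, seed=1):
    rng = random.Random(seed)
    return all(contraction_inequality_holds(*random_instance(rng)) for _ in range(trials))

print(check_many())
```

Parallel edges are allowed here; the rank function of a multigraph is still a matroid rank function, so the argument of the proof applies unchanged.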

We note that the proof of Lemma 22 easily generalizes to other matroids. This is a departure from the existing proofs in [19] and [4, Lemma 3.9], and Rizzi's more specific result [33, Lemma 2], although a strong exchange property of matroids is used in the proof of [4].

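We are finally near the end of the analysis, where the Contraction Lemma comes into play. We can now bound the value $f_i(K^i)$ for all $0 \le i \le p-1$ in terms of the cost of $T^*$'s loss. In the remainder of the section, let the full components of $T^*$ be $K^{*,1}, \ldots, K^{*,q}$, let $l^*$ denote $l(T^*)$, let $\overline{\mathrm{mst}}^i$ denote $\overline{\mathrm{mst}}(\mathcal{S}^i)$ and let $\overline{\mathrm{mst}}^*$ denote $\overline{\mathrm{mst}}(T^*)$.

Lemma 23. For all $0 \le i \le p-1$, if $\overline{\mathrm{mst}}^i - \overline{\mathrm{mst}}^* > 0$, then $f_i(K^i) \le l^*/(\overline{\mathrm{mst}}^i - \overline{\mathrm{mst}}^*)$.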

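Proof. By the choice of $K^i$ in Algorithm 1, we have $f_i(K^i) \le \min_j f_i(K^{*,j})$. A standard fraction averaging argument implies that

\[
f_i(K^i) \le \frac{\sum_{j=1}^{q} l(K^{*,j})}{\sum_{j=1}^{q} \bigl(\overline{\mathrm{mst}}(\mathcal{S}^i) - \overline{\mathrm{mst}}(\mathcal{S}^i \cup \{K^{*,j}\})\bigr)}
\le \frac{l^*}{\sum_{j=1}^{q} \bigl(\overline{\mathrm{mst}}(\mathcal{S}^i \cup \{K^{*,1}, \ldots, K^{*,j-1}\}) - \overline{\mathrm{mst}}(\mathcal{S}^i \cup \{K^{*,1}, \ldots, K^{*,j}\})\bigr)}, \tag{20}
\]

where the last inequality uses Fact 9 and Lemma 22. (Additional care is needed when $T^*$ and $\mathcal{S}^p$ overlap in some full components, but the above inequalities still hold.) The denominator of the right-hand side of Equation (20) is a telescoping sum. Canceling like terms, and using Lemma 21 to replace $\overline{\mathrm{mst}}(\mathcal{S}^i \cup \{K^{*,1}, \ldots, K^{*,q}\})$ with $\overline{\mathrm{mst}}^*$, we are done. □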

We can now bound the cost of T p . 

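Proof of Lemma 10. We first bound the loss $l(T^p)$ of tree $T^p$. Using Fact 9,

\[
l(T^p) = \sum_{i=0}^{p-1} l(K^i) = \sum_{i=0}^{p-1} f_i(K^i) \cdot \bigl(\overline{\mathrm{mst}}^i - \overline{\mathrm{mst}}^{i+1}\bigr), \tag{21}
\]

where the last equality uses the definition of $f_i$ from (10). Using Corollary 20 and Lemma 23, the right-hand side of Equation (21) is bounded as follows:

\[
\sum_{i=0}^{p-1} f_i(K^i) \cdot \bigl(\overline{\mathrm{mst}}^i - \overline{\mathrm{mst}}^{i+1}\bigr)
\le \sum_{i=0}^{p-1} \frac{l^*}{\max\{l^*, \overline{\mathrm{mst}}^i - \overline{\mathrm{mst}}^*\}} \cdot \bigl(\overline{\mathrm{mst}}^i - \overline{\mathrm{mst}}^{i+1}\bigr). \tag{22}
\]

The right-hand side of Equation (22) can in turn be bounded from above by the following integral:

\[
\sum_{i=0}^{p-1} \frac{l^* \cdot \bigl(\overline{\mathrm{mst}}^i - \overline{\mathrm{mst}}^{i+1}\bigr)}{\max\{l^*, \overline{\mathrm{mst}}^i - \overline{\mathrm{mst}}^*\}}
\le \int_{\overline{\mathrm{mst}}^p}^{\overline{\mathrm{mst}}^0} \frac{l^*}{\max\{l^*, x - \overline{\mathrm{mst}}^*\}}\, dx
= \int_{\overline{\mathrm{mst}}^p - \overline{\mathrm{mst}}^*}^{\overline{\mathrm{mst}}^0 - \overline{\mathrm{mst}}^*} \frac{l^*}{\max\{l^*, x\}}\, dx. \tag{23}
\]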

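Notice that $\overline{\mathrm{mst}}^0 = \mathrm{mst}(G[R],c) \ge \mathrm{opt}_r = l^* + \overline{\mathrm{mst}}^*$. The termination condition in Algorithm 1 and Lemma 6 imply that $\overline{\mathrm{mst}}^p \le \mathrm{opt}_r$. Hence the result of evaluating the integral in the right-hand side of Equation (23) is

\[
l^* - \bigl(\overline{\mathrm{mst}}^p - \overline{\mathrm{mst}}^*\bigr) + l^* \cdot \int_{l^*}^{\overline{\mathrm{mst}}^0 - \overline{\mathrm{mst}}^*} \frac{1}{x}\, dx
= \mathrm{opt}_r - \overline{\mathrm{mst}}^p + l^* \cdot \ln\left(\frac{\overline{\mathrm{mst}}^0 - \overline{\mathrm{mst}}^*}{l^*}\right), \tag{24}
\]

where the equality uses Lemma 17. Applying Lemma 17 two more times, and combining Equations (21)-(24), we obtain

\[
\begin{aligned}
c(T^p) = \overline{\mathrm{mst}}^p + l(T^p)
&\le \mathrm{opt}_r + l^* \cdot \ln\left(\frac{\overline{\mathrm{mst}}^0 - \overline{\mathrm{mst}}^*}{l^*}\right) \\
&= \mathrm{opt}_r + l^* \cdot \ln\left(1 + \frac{\overline{\mathrm{mst}}^0 - (\overline{\mathrm{mst}}^* + l^*)}{l^*}\right) \\
&= \mathrm{opt}_r + l^* \cdot \ln\left(1 + \frac{\mathrm{mst}(G[R],c) - \mathrm{opt}_r}{l^*}\right),
\end{aligned}
\]

as wanted. □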
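The two analytic steps in this proof, the sum-to-integral bound (23) and the closed form (24), can be checked numerically. The sketch below is illustrative only: the sequence `MST_BAR` and the values of `L_STAR` and `MST_BAR_STAR` are made-up numbers chosen to satisfy $\overline{\mathrm{mst}}^p - \overline{\mathrm{mst}}^* \le l^* \le \overline{\mathrm{mst}}^0 - \overline{\mathrm{mst}}^*$, and all function names are hypothetical.

```python
import math

def integral_closed_form(l_star, lo, hi):
    """Integral of l*/max(l*, x) over [lo, hi], assuming lo <= l* <= hi, as in (24)."""
    return (l_star - lo) + l_star * math.log(hi / l_star)

def integral_midpoint(l_star, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the same integral."""
    h = (hi - lo) / steps
    return h * sum(l_star / max(l_star, lo + (k + 0.5) * h) for k in range(steps))

def greedy_sum(l_star, mst_bar_star, mst_bar):
    """Right-hand side of (22) for a decreasing sequence mst_bar[0..p]."""
    return sum(l_star / max(l_star, mst_bar[i] - mst_bar_star) * (mst_bar[i] - mst_bar[i + 1])
               for i in range(len(mst_bar) - 1))

# Hypothetical values, chosen only to satisfy the assumptions stated above.
L_STAR, MST_BAR_STAR = 2.0, 1.0
MST_BAR = [9.0, 6.0, 4.5, 2.5]

lo, hi = MST_BAR[-1] - MST_BAR_STAR, MST_BAR[0] - MST_BAR_STAR
print(greedy_sum(L_STAR, MST_BAR_STAR, MST_BAR))
print(integral_closed_form(L_STAR, lo, hi), integral_midpoint(L_STAR, lo, hi))
```

The sum is bounded by the integral because the integrand is non-increasing in $x$, so each summand uses the smallest value of the integrand on its interval.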

Remark. Gröpl et al. essentially prove Lemma 10 in [19, Lemma 4.3] but a minor error lies in their equation "(18)." Namely, they assume "$m_i - m^* > 0$" which is $\overline{\mathrm{mst}}^i - \overline{\mathrm{mst}}^* > 0$ in our notation.